ConvertFromUnicodeToText

Inside Macintosh: Programming With the Text Encoding Conversion Manager /

Chapter 4 - Unicode Converter Reference / Unicode Converter Functions

Converts a string from Unicode to the specified encoding.
pascal OSStatus ConvertFromUnicodeToText ( 
                     UnicodeToTextInfo iUnicodeToTextInfo, 
                     ByteCount iUnicodeLen, 
                     ConstUniCharArrayPtr iUnicodeStr,
                     OptionBits iControlFlags, 
                     ItemCount iOffsetCount, 
                     ByteOffset iOffsetArray[], 
                     ItemCount *oOffsetCount, 
                     ByteOffset oOffsetArray[], 
                     ByteCount iOutputBufLen, 
                     ByteCount *oInputRead, 
                     ByteCount *oOutputLen, 
                     LogicalAddress oOutputStr);
iUnicodeToTextInfo
A Unicode converter object of type UnicodeToTextInfo for converting text from Unicode. You use the function CreateUnicodeToTextInfo (page 135) or CreateUnicodeToTextInfoByEncoding (page 136) to obtain a Unicode converter object to specify for this parameter.
iUnicodeLen
The length in bytes of the Unicode string to be converted.
iUnicodeStr
A pointer to the Unicode string to be converted. If the input text is UTF-8, which is supported for versions 1.2.1 or later of the converter, you must cast the UTF-8 buffer pointer to ConstUniCharArrayPtr before you can pass it as this parameter.
iControlFlags
Conversion control flags. You can use the following bitmasks to set the control flags that apply to this function:
kUnicodeUseFallbacksMask
kUnicodeKeepInfoMask
kUnicodeVerticalFormMask
kUnicodeLooseMappingsMask
kUnicodeStringUnterminatedMask
kUnicodeForceASCIIRangeMask
kUnicodeNoHalfwidthCharsMask
You can also specify one of the following directionality masks:
kUnicodeDefaultDirectionMask
kUnicodeLeftToRightMask
kUnicodeRightToLeftMask

For a description of these control flags, see "Conversion Control Flags" (page 110).
iOffsetCount
The number of offsets contained in the array provided by the iOffsetArray parameter. Your application supplies this value. If you don't want offsets returned to you, specify 0 (zero) for this parameter.
iOffsetArray
An array of type ByteOffset. On input, you specify the array that gives an ordered list of significant byte offsets pertaining to the Unicode source string to be converted. These offsets may identify font or style changes, for example, in the source string. If you don't want offsets returned to your application, specify NULL for this parameter and 0 (zero) for iOffsetCount. All offsets must be less than iUnicodeLen.
oOffsetCount
A pointer to an ItemCount. On output, this value contains the number of offsets that were mapped in the output stream.
oOffsetArray
An array of type ByteOffset. On output, this array contains the corresponding new offsets for the converted string in the new encoding.
iOutputBufLen
The length in bytes of the output buffer pointed to by the oOutputStr parameter. Your application supplies this buffer to hold the returned converted string. The oOutputLen parameter may return a byte count that is less than this value if the converted byte string is smaller than the buffer size you allocated.
oInputRead
A pointer to a value of type ByteCount. On output, this value gives the number of bytes of the Unicode string that were converted. If the function returns a kTECUnmappableElementErr result code, this parameter returns the number of bytes that were converted before the error occurred.
oOutputLen
A pointer to a value of type ByteCount. On output, this value gives the length in bytes of the converted text stream.
oOutputStr
A value of type LogicalAddress. On input, this value points to a buffer for the converted string. On output, the buffer holds the converted text string. (For guidelines on estimating the size of the buffer needed, see the following discussion.)
function result
A result code. The function returns a noErr result code if it has completely converted the Unicode string to the destination encoding without using fallback characters. If the function returns the paramErr or kTECGlobalsUnavailableErr result codes, it did not convert the string.

If the function returns kTECTableFormatErr, the converter encountered a table in an unknown format. The function did not completely convert the input string (and may not have converted any of it).

If the function returns kTECBufferBelowMinimumSizeErr, the output buffer was too small to allow conversion of any part of the input string. You need to increase the size of the output buffer and try again.

If the function returns the kTECUsedFallbacksStatus result code, the function has completely converted the string using one or more fallback characters. This can only happen if you set the Unicode-use-fallbacks control flag.

If the function returns kTECOutputBufferFullErr, the output buffer was not big enough to completely convert the input; oInputRead indicates the amount of input converted. You can call the function again with another output buffer (or with the same output buffer, after copying its contents) to convert the remainder of the Unicode string.

If the function returns kTECPartialCharErr, the Unicode input string ended with an incomplete UTF-8 character (this can only happen for UTF-8 input). If you have subsequent input text available, you can prepend the unconverted input from this call to the subsequent input text and call the function again.
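
The carry-over step described for kTECPartialCharErr can be sketched as a small buffer helper. This is only an illustration: join_unconverted_input is a hypothetical name, not part of the Unicode Converter, and the sketch assumes that the value returned in oInputRead counts the bytes consumed before the incomplete UTF-8 character.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Unicode Converter function): builds a new
   buffer holding the unconverted tail of `prev` followed by `next`.
   `inputRead` is assumed to be the value returned in oInputRead.
   Returns a malloc'd buffer the caller must free, or NULL if the
   allocation fails. */
unsigned char *join_unconverted_input(const unsigned char *prev, size_t prevLen,
                                      size_t inputRead,
                                      const unsigned char *next, size_t nextLen,
                                      size_t *joinedLen)
{
    size_t tailLen, total;
    unsigned char *joined;

    assert(inputRead <= prevLen);
    assert(joinedLen != NULL);

    tailLen = prevLen - inputRead;          /* bytes left unconverted */
    total = tailLen + nextLen;
    joined = (unsigned char *)malloc(total > 0 ? total : 1);
    if (joined == NULL)
        return NULL;

    if (tailLen > 0)
        memcpy(joined, prev + inputRead, tailLen);   /* leftover first */
    if (nextLen > 0)
        memcpy(joined + tailLen, next, nextLen);     /* then new input */
    *joinedLen = total;
    return joined;
}
```

You would pass the joined buffer and its length as the input to the next call, then free it once that call returns.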

If the function returns kTECUnmappableElementErr, an input text element could not be mapped to the destination encoding. The function did not completely convert the Unicode input string. This can only happen if you did not set the Unicode-use-fallbacks control flag. You can set this flag and convert the remaining unconverted input, or take some other action.

If the function returns kTextUndefinedElementErr, the Unicode input string included a value which is undefined for the specified Unicode version. The function did not completely convert the input string, and fallback handling will not be invoked. You can resume conversion from a point beyond the offending Unicode character, or take some other action.

If the function returns kTextIncompleteElementErr, then either the input string included a text element which is too long for the internal buffers, or the input string ended with a text element which may be incomplete (this latter case can only happen if you set the kUnicodeStringUnterminatedMask control flag). The function did not completely convert the input string, and fallback handling will not be invoked.

For additional information, see "Text Encoding Conversion Manager Result Codes" (page 42) in the chapter "Basic Text Types Reference."
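
The retry pattern described for kTECOutputBufferFullErr (save the output, then call again for the remainder) might be structured as in the following sketch. Everything here is an assumption made for illustration: StepResult, ConvertStep, toy_step, and convert_in_chunks are hypothetical names, the outcome values are not the real OSStatus codes, and toy_step merely copies bytes so that the loop can run without the converter. In real code, the step function would wrap a call to ConvertFromUnicodeToText and translate its result code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcomes for this sketch. These are NOT the real OSStatus
   values; a wrapper around ConvertFromUnicodeToText would translate
   noErr, kTECOutputBufferFullErr, and other result codes into them. */
typedef enum { STEP_DONE, STEP_OUTPUT_FULL, STEP_FAILED } StepResult;

/* Hypothetical wrapper signature for one conversion call. */
typedef StepResult (*ConvertStep)(const unsigned char *in, size_t inLen,
                                  unsigned char *out, size_t outCap,
                                  size_t *inputRead, size_t *outputLen);

/* Toy stand-in for the converter so the sketch runs on its own: copies
   bytes unchanged and reports "output full" when they do not all fit. */
StepResult toy_step(const unsigned char *in, size_t inLen,
                    unsigned char *out, size_t outCap,
                    size_t *inputRead, size_t *outputLen)
{
    size_t n = inLen < outCap ? inLen : outCap;
    memcpy(out, in, n);
    *inputRead = n;
    *outputLen = n;
    return n == inLen ? STEP_DONE : STEP_OUTPUT_FULL;
}

/* Converts all of `in`, reusing one small chunk buffer and appending each
   chunk to `dest`. Returns the number of bytes appended, or (size_t)-1
   on failure (step error, `dest` too small, or no progress). */
size_t convert_in_chunks(ConvertStep step,
                         const unsigned char *in, size_t inLen,
                         unsigned char *chunk, size_t chunkCap,
                         unsigned char *dest, size_t destCap)
{
    size_t pos = 0, total = 0;

    assert(step != NULL && chunkCap > 0);
    for (;;) {
        size_t read = 0, written = 0;
        StepResult r = step(in + pos, inLen - pos, chunk, chunkCap,
                            &read, &written);
        if (r == STEP_FAILED || written > destCap - total)
            return (size_t)-1;
        memcpy(dest + total, chunk, written);  /* save before reusing chunk */
        total += written;
        pos += read;                           /* resume after converted input */
        if (r == STEP_DONE)
            return total;
        if (read == 0)
            return (size_t)-1;                 /* no progress: stop looping */
    }
}
```

The check for zero progress guards against looping forever when the chunk buffer is too small for any output, which corresponds to the kTECBufferBelowMinimumSizeErr case.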

DISCUSSION
The ConvertFromUnicodeToText function converts a Unicode text string to the destination encoding you specified when you obtained the Unicode converter object: either in the Unicode mapping structure you passed to CreateUnicodeToTextInfo (page 135) or in the text encoding you passed to CreateUnicodeToTextInfoByEncoding (page 136). You pass the returned object to ConvertFromUnicodeToText as the iUnicodeToTextInfo parameter.
In addition to converting the Unicode string, ConvertFromUnicodeToText can map offsets for style or font information from the source text string to the returned converted string. The converter reads the application-supplied offsets and returns the corresponding new offsets in the converted string. If you do not want font or style information offsets mapped to the resulting string, you should pass NULL for iOffsetArray and 0 (zero) for iOffsetCount.
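
Before you pass an offset array, you may want to check it against the constraints given for iOffsetArray: the list is ordered and every offset is less than iUnicodeLen. The following is a minimal sketch of such a check. The function name offsets_are_valid is hypothetical, the typedefs are stand-ins for the system types of the same names, and treating equal neighboring offsets as acceptable is an assumption, since the description says only that the list is ordered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the Mac OS types of the same names; the real
   definitions come from the system headers. */
typedef unsigned long ByteCount;
typedef unsigned long ByteOffset;
typedef unsigned long ItemCount;

/* Hypothetical helper: returns true if the offsets are in nondecreasing
   order and each one is less than unicodeLen. An empty list (count 0)
   is always acceptable, matching the "no offsets wanted" case. */
bool offsets_are_valid(const ByteOffset offsets[], ItemCount count,
                       ByteCount unicodeLen)
{
    ItemCount i;

    assert(count == 0 || offsets != NULL);
    for (i = 0; i < count; i++) {
        if (offsets[i] >= unicodeLen)
            return false;                     /* points past the source text */
        if (i > 0 && offsets[i] < offsets[i - 1])
            return false;                     /* list is not ordered */
    }
    return true;
}
```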
Your application must allocate a buffer to hold the resulting converted string and pass a pointer to the buffer in the oOutputStr parameter. To determine the size of the output buffer to allocate, you should consider the size and content of the Unicode source string in relation to the type of encoding to which it will be converted. For example, for many encodings, such as MacRoman and Shift-JIS, the size of the returned string will be between half the size and the same size as the source Unicode string. However, for some non-Mac OS encodings, such as EUC-JP, which has some 3-byte characters for Kanji, the returned string could be larger than the source Unicode string. For MacArabic and MacHebrew, the result will usually be less than half the size of the Unicode string.
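
One way to turn these guidelines into a conservative starting size is to allow the largest destination character for every 16-bit unit of the source. The sketch below assumes 16-bit Unicode input and a caller-supplied maximum character size for the destination encoding; estimate_output_buffer_len is a hypothetical helper, and ByteCount is a stand-in for the system type. The estimate does not account for fallback sequences or any other output longer than one character per source unit, so you should still be prepared to handle kTECOutputBufferFullErr.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the Mac OS type of the same name. */
typedef unsigned long ByteCount;

/* Hypothetical helper: estimates an output buffer size by assuming that
   each 16-bit unit of the source produces at most maxBytesPerChar bytes
   in the destination encoding (an assumption supplied by the caller). */
ByteCount estimate_output_buffer_len(ByteCount unicodeLen,
                                     ByteCount maxBytesPerChar)
{
    ByteCount units;

    assert(maxBytesPerChar > 0);
    units = (unicodeLen + 1) / 2;      /* 16-bit units, rounding up */
    return units * maxBytesPerChar;
}
```

For a 20-byte source string, a limit of 1 byte per character gives 10 bytes (half the source size), 2 bytes gives 20 (the same size), and 3 bytes gives 30 (larger than the source), which matches the ranges described above.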
This function modifies the contents of the Unicode converter object you passed as the iUnicodeToTextInfo parameter.

SEE ALSO
The function ConvertFromTextToUnicode (page 129)